
    Distributed Tree Kernels

    In this paper, we propose distributed tree kernels (DTK), a novel method to reduce the time and space complexity of tree kernels. Using a linear-complexity algorithm to compute vectors for trees, we embed feature spaces of tree fragments in low-dimensional spaces, where the kernel computation reduces to a dot product. We show that DTKs are faster, correlate with tree kernels, and obtain statistically similar performance in two natural language processing tasks.
    Comment: ICML201
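The core idea, that an explicit fragment-overlap kernel can be approximated by a dot product of low-dimensional random embeddings, can be sketched as follows. This is a minimal illustration, not the authors' algorithm; the fragment labels and dimension are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4096  # embedding dimension (illustrative choice)

# Toy trees represented as sets of (invented) fragment labels.
frags_a = {"NP->DT NN", "VP->V NP", "S->NP VP"}
frags_b = {"NP->DT NN", "VP->V PP", "S->NP VP"}

# Give each fragment a random vector scaled by 1/sqrt(d), so that
# v . v is close to 1 and v . w is close to 0 for distinct fragments.
table = {f: rng.standard_normal(d) / np.sqrt(d) for f in frags_a | frags_b}

def embed(fragments):
    """Tree embedding: the sum of its fragment vectors."""
    return sum(table[f] for f in fragments)

exact = len(frags_a & frags_b)                    # explicit fragment-overlap kernel
approx = float(embed(frags_a) @ embed(frags_b))   # dot-product approximation
```

As the dimension grows, `approx` concentrates around `exact`; the full DTK construction additionally builds the tree vectors compositionally in linear time rather than enumerating fragments.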

    Parsing with CYK over Distributed Representations

    Syntactic parsing is a key task in natural language processing, long dominated by symbolic, grammar-based parsers. Neural networks, with their distributed representations, are challenging these methods. In this article we show that existing symbolic parsing algorithms can be entirely reformulated over distributed representations. To this end we introduce a version of the traditional Cocke-Younger-Kasami (CYK) algorithm, called D-CYK, which is defined entirely over distributed representations. Our D-CYK uses matrix multiplication on real-valued matrices whose size is independent of the length of the input string. These operations are compatible with traditional neural networks. Experiments show that our D-CYK approximates the original CYK algorithm. By showing that CYK can be performed entirely on distributed representations, we open the way to the definition of recurrent layers of CYK-informed neural networks.
    Comment: The algorithm has been greatly improved. Experiments have been redesigne
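For reference, the symbolic algorithm that D-CYK approximates is the classic CYK recognizer over a grammar in Chomsky normal form; a compact sketch, with a toy grammar invented for the example:

```python
def cyk(tokens, lexical, binary, start="S"):
    """Return True iff `tokens` is derivable from `start`.
    lexical: set of (A, terminal) rules; binary: set of (A, B, C) rules A -> B C."""
    n = len(tokens)
    # table[i][j] holds the nonterminals deriving tokens[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):
        table[i][i + 1] = {A for A, t in lexical if t == tok}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for A, B, C in binary:
                    if B in table[i][k] and C in table[k][j]:
                        table[i][j].add(A)
    return start in table[0][n]

lexical = {("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "saw")}
binary = {("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP")}
ok = cyk("the dog saw the cat".split(), lexical, binary)
```

D-CYK replaces the set-valued cells of this table with fixed-size distributed vectors, so that the combination step becomes a matrix multiplication instead of a rule lookup.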

    Empowering Multi-step Reasoning across Languages via Tree-of-Thoughts

    Chain-of-Thought (CoT) prompting empowers the reasoning abilities of Large Language Models (LLMs), eliciting them to solve complex reasoning tasks step by step. Despite the success of CoT methods, however, the ability to deliver multi-step reasoning remains largely limited to English because of the imbalanced distribution of pre-training data, leaving other languages behind. In this work, we propose a cross-lingual multi-step reasoning approach that aims to align reasoning processes across different languages. In particular, our method, through a self-consistent cross-lingual prompting mechanism inspired by the Tree-of-Thoughts approach, delivers multi-step reasoning paths in different languages that, step by step, lead to the final solution. Our experimental evaluations show that our method significantly outperforms existing prompting methods, reducing the number of interactions and achieving state-of-the-art performance.
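Although the paper's prompting pipeline is more involved, the self-consistency step it builds on, sampling several reasoning paths and keeping the majority final answer, can be sketched as follows (the paths below are invented placeholders, not model output):

```python
from collections import Counter

def self_consistent_answer(paths):
    """paths: list of (language, final_answer) pairs from independent reasoning chains.
    Returns the answer backed by the most chains (majority vote)."""
    votes = Counter(answer for _, answer in paths)
    return votes.most_common(1)[0][0]

# Hypothetical final answers extracted from chains prompted in different languages.
paths = [("en", "42"), ("it", "42"), ("de", "41"), ("fr", "42")]
best = self_consistent_answer(paths)
```

The cross-lingual twist is that the chains are elicited in different languages before their final answers are aligned and voted on.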

    HANS, are you clever? Clever Hans Effect Analysis of Neural Systems

    Instruction-tuned Large Language Models (It-LLMs) exhibit outstanding abilities to reason about the cognitive states, intentions, and reactions of the people involved in an interaction, letting humans guide and comprehend day-to-day social interactions effectively. Accordingly, several multiple-choice question (MCQ) benchmarks have been proposed to construct solid assessments of these abilities. However, earlier works have demonstrated an inherent "order bias" in It-LLMs that complicates proper evaluation. In this paper, we investigate the resilience of It-LLMs under a series of probing tests on four MCQ benchmarks. Introducing adversarial examples, we show a significant performance gap, mainly when the order of the choices is varied, which reveals a selection bias and calls the models' reasoning abilities into question. Observing a correlation between first positions and model choices, attributable to positional bias, we hypothesize the presence of structural heuristics in the decision-making process of It-LLMs, reinforced by the inclusion of salient examples in few-shot scenarios. Finally, using the Chain-of-Thought (CoT) technique, we elicit the models to reason, mitigating the bias and obtaining more robust models.
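A minimal way to probe the selection bias described above is to permute the answer options and check whether the chosen *content* stays fixed; a sketch with stub models (toy stand-ins, not It-LLMs):

```python
from itertools import permutations

def choices_across_orderings(model, question, options):
    """Collect the option contents a model selects across all orderings.
    `model(question, options)` returns the index of the chosen option."""
    picks = set()
    for perm in permutations(options):
        perm = list(perm)
        picks.add(perm[model(question, perm)])
    return picks  # an order-robust model yields a singleton set

positional = lambda q, opts: 0             # biased stub: always picks the first slot
robust = lambda q, opts: opts.index("4")   # robust stub: tracks the content

biased_picks = choices_across_orderings(positional, "2+2=?", ["4", "5", "3"])
robust_picks = choices_across_orderings(robust, "2+2=?", ["4", "5", "3"])
```

The size of the returned set quantifies how much the answer depends on position rather than content.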

    Distributed Smoothed Tree Kernel

    In this paper we explore the possibility of merging the world of Compositional Distributional Semantic Models (CDSM) with that of Tree Kernels (TK). In particular, we introduce a specific tree kernel (the smoothed tree kernel, or STK) and then show that it is possible to approximate this kernel with the dot product of two vectors obtained compositionally from the sentences, thereby creating a new CDSM.
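The compositional side of the approximation, building a sentence vector from word vectors and comparing sentences with a dot product, can be illustrated with the simplest additive composition (random vectors stand in for real distributional ones):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 300  # illustrative dimension
vocab = ["the", "cat", "sat", "slept", "a", "dog", "ran"]
emb = {w: rng.standard_normal(d) for w in vocab}

def compose(sentence):
    """Additive composition: the sentence vector is the sum of its word vectors."""
    return sum(emb[w] for w in sentence.split())

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

near = cos(compose("the cat sat"), compose("the cat slept"))
far = cos(compose("the cat sat"), compose("a dog ran"))
```

The smoothed tree kernel additionally weights lexical matches by syntactic structure; the point here is only that sentence similarity reduces to a vector operation.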

    Risk Assessment for Venous Thromboembolism in Chemotherapy-Treated Ambulatory Cancer Patients: A Machine Learning Approach

    OBJECTIVE: To design a precision medicine approach aimed at exploiting significant patterns in data, in order to produce venous thromboembolism (VTE) risk predictors for cancer outpatients that might be of advantage over the currently recommended model (Khorana score). DESIGN: Multiple kernel learning (MKL) based on support vector machines and random optimization (RO) models were used to produce VTE risk predictors (referred to as machine learning [ML]-RO) yielding the best classification performance over a training (3-fold cross-validation) and testing set. RESULTS: Attributes of the patient data set (n = 1179) were clustered into 9 groups according to clinical significance. Our analysis produced 6 ML-RO models in the training set, which yielded better likelihood ratios (LRs) than baseline models. Of interest, the most significant LRs were observed in 2 ML-RO approaches not including the Khorana score (ML-RO-2: positive likelihood ratio [+LR] = 1.68, negative likelihood ratio [-LR] = 0.24; ML-RO-3: +LR = 1.64, -LR = 0.37). The enhanced performance of ML-RO approaches over the Khorana score was further confirmed by the analysis of the areas under the precision-recall curve (AUCPR), which were higher for the ML-RO approaches (best performances: ML-RO-2: AUCPR = 0.212; ML-RO-3-K: AUCPR = 0.146) than for the Khorana score (AUCPR = 0.096). Of interest, the best-fitting model was ML-RO-2, in which blood lipids and body mass index/performance status retained the strongest weights, with a weaker association with tumor site/stage and drugs. CONCLUSIONS: Although the monocentric validation of the presented predictors might represent a limitation, these results demonstrate that a model based on MKL and RO may represent a novel methodological approach to derive VTE risk classifiers. Moreover, this study highlights the advantages of optimizing the relative importance of groups of clinical attributes in the selection of VTE risk predictors.
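The likelihood ratios used to compare the classifiers derive directly from sensitivity and specificity; a sketch with invented confusion-matrix counts (not the study's data):

```python
def likelihood_ratios(tp, fp, fn, tn):
    """+LR = sensitivity / (1 - specificity); -LR = (1 - sensitivity) / specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity

# Hypothetical counts for a VTE risk classifier on a test set.
plr, nlr = likelihood_ratios(tp=80, fp=40, fn=20, tn=60)
```

A +LR above 1 means a positive prediction raises the odds of VTE, and a -LR below 1 means a negative prediction lowers them, which is why higher +LR and lower -LR indicate a better classifier.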

    A Trip Towards Fairness: Bias and De-Biasing in Large Language Models

    A surge in the popularity of transformer-based Language Models (such as GPT (Brown et al., 2020) and PaLM (Chowdhery et al., 2022)) has opened the door to new Machine Learning applications, in particular in Natural Language Processing, where pre-training on large text corpora is essential for achieving remarkable results in downstream tasks. However, these Language Models appear to carry inherent biases toward certain demographics reflected in their training data. While research has attempted to mitigate this problem, existing methods either fail to remove the bias altogether, degrade performance, or are expensive. This paper examines the bias produced by promising Language Models when varying parameters and pre-training data. Finally, we propose a de-biasing technique that produces robust de-biased models that maintain performance on downstream tasks.

    Senso Comune as a Knowledge Base of Italian language: The Resource and its Development

    Senso Comune is a linguistic knowledge base for the Italian language, which accommodates the content of a legacy dictionary in a rich formal model. The model is implemented in a platform that allows a community of contributors to enrich the resource. We provide here an overview of the main project features, including the lexical-ontology model, the process of sense classification, and the annotation of meaning definitions (glosses) and lexicographic examples. We also illustrate the latest work on alignment with MultiWordNet, describing the methodologies that have been experimented with, sharing some preliminary results, and highlighting some remarkable findings about the semantic coverage of the two resources.